This article originally appeared in The Bar Examiner print edition, Summer 2023 (Vol. 92, No. 2), pp. 35–36.

By Rosemary Reshetar, EdD
You may think of fish or mountains upon mention of scales and scaling, as the title of Dr. Mark A. Albanese’s December 2014 article in this magazine about scaling suggests,1 but there is terminology for scales and scaling2 that is specific to testing. In this column, I’ll cover this terminology, with a focus on scales.
For any test, the scores used to reflect examinee performance are referred to as scale scores. Nearly everyone becomes familiar early in their education with number-correct and percent-correct scores, often via classroom tests. These are examples of scales based on raw scores (e.g., the number correct or the number of points earned).
It is also common to translate a percent-correct score into a letter grade of A, B, C, and so on. For example, 90% or higher is an A, 80% or higher a B, and 70% or higher a C. Adding plus or minus symbols to such grades permits finer-grained levels if desired. In the example of a single classroom test, the number correct, the percent correct, and the grade apply only to that test, not to multiple test forms across time.
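To make the translation concrete, here is a minimal Python sketch of the percent-correct-to-letter-grade mapping just described. The 90/80/70 cutoffs come from the example above; the function name and everything else are illustrative, not any standard:

```python
def letter_grade(num_correct: int, num_items: int) -> str:
    """Convert a raw score to a percent-correct score, then to a letter grade.

    The 90/80/70 cutoffs follow the example in the text; anything below 70
    is reported here simply as "below C".
    """
    percent = 100 * num_correct / num_items
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C")):
        if percent >= cutoff:
            return grade
    return "below C"


# A 26-out-of-30 classroom quiz is about 87% correct, i.e., a B.
print(letter_grade(26, 30))  # -> "B"
```

Note that the grade depends only on this one test’s raw score, which is exactly why such scores do not transfer across test forms.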
Scales and Standardized Tests
When we move from a classroom test setting to a standardized test setting,3 the need arises to compare scores over time and ensure that scores have the same meaning regardless of the test form taken. This is accomplished by designating and maintaining a primary score scale, which is also the scale on which scores are reported. Equating procedures are commonly used to link the raw scores on any single form to the primary score scale. Equating is a statistical procedure that adjusts examinees’ scores to compensate for variations in exam difficulty, helping ensure that current examinees’ scores reflect only their proficiency rather than any differences in the difficulty of the questions they answered. In combination with an equating process, score scales enable the comparison of individuals who take different forms of the test. An assessment may have a single reported score scale, as the Multistate Bar Examination (MBE) does, or multiple reported scales, as the SAT and ACT do.
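Operational equating designs are considerably more involved, but a minimal sketch can convey the idea. The Python below implements simple linear (mean-sigma) equating under the strong assumption that the two forms were taken by equivalent groups of examinees; the scores and function name are invented for illustration and do not reflect NCBE’s actual procedures:

```python
import statistics


def linear_equate(raw_new, raw_scores_new_form, raw_scores_ref_form):
    """Linear (mean-sigma) equating: map a raw score on a new form onto the
    metric of a reference form by matching means and standard deviations.

    Assumes the two examinee groups are equivalent; real equating designs
    (e.g., those using anchor items) are more involved.
    """
    m_new = statistics.mean(raw_scores_new_form)
    s_new = statistics.stdev(raw_scores_new_form)
    m_ref = statistics.mean(raw_scores_ref_form)
    s_ref = statistics.stdev(raw_scores_ref_form)
    return m_ref + (s_ref / s_new) * (raw_new - m_new)


new_form = [105, 115, 120, 125, 135]  # hypothetical raw scores, new form
ref_form = [115, 125, 130, 135, 145]  # hypothetical raw scores, reference form
print(round(linear_equate(120, new_form, ref_form), 1))  # -> 130.0
```

Because the hypothetical new form ran harder (its score distribution sits lower), a raw 120 on it is mapped up to 130 on the reference metric, compensating for the difficulty difference rather than penalizing the examinee for it.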
Let’s put this in the context of the MBE. A raw score on the MBE is the number of correct answers out of 175. As Dr. Albanese notes, “raw scores on the MBE tell you little.”4 (NCBE discontinued reporting raw MBE scores to jurisdictions as of the February 2014 administration.) Albanese explains that the MBE score generated through the statistical process of equating ensures that scores have consistent meaning across administrations. The performance information provided for the MBE is a scaled score that can range from about 40 (low) to 200 (high). This score is incorporated into the Uniform Bar Examination (UBE) total score, which is reported on a 400-point scale. Bar admissions stakeholders are familiar with these scales and with the meaning of points along them. For the UBE, the point of greatest interest for jurisdictions and examinees is each jurisdiction’s minimum passing score, which ranged from 260 to 273 for the February 2023 administration. Many non-UBE jurisdictions use a similar procedure, scaling their written components to the MBE scores to obtain a reported bar exam score.
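To illustrate the scaling idea mentioned here and in note 2, one common approach is to give the written raw scores the same mean and standard deviation as the MBE scaled scores earned by the same group of examinees. The sketch below uses invented numbers and an equally weighted two-component total; it is a simplification for illustration, not NCBE’s operational process:

```python
import statistics


def scale_to_mbe(written_raw, mbe_scaled):
    """Rescale written raw scores so their mean and standard deviation match
    the MBE scaled scores earned by the same examinees (a simplification of
    the operational procedure described in note 2).
    """
    m_w, s_w = statistics.mean(written_raw), statistics.stdev(written_raw)
    m_m, s_m = statistics.mean(mbe_scaled), statistics.stdev(mbe_scaled)
    return [m_m + (s_m / s_w) * (x - m_w) for x in written_raw]


written = [38, 42, 45, 48, 52]       # hypothetical written raw scores
mbe = [128, 136, 140, 144, 152]      # the same examinees' MBE scaled scores
scaled_written = scale_to_mbe(written, mbe)

# In this simplified illustration, the two components are weighted equally,
# yielding totals on a 400-point scale like the UBE's.
totals = [m + w for m, w in zip(mbe, scaled_written)]
print([round(t) for t in totals])  # -> [256, 271, 280, 289, 304]
```

Placing the written scores on the MBE metric before combining them is what lets a single reported total carry a consistent meaning across both components.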
Some other large-scale assessments, such as higher education admissions exams, also have scales with which examinees and stakeholders are familiar. The SAT reports scores on a scale of 200–800 points for each section (Math and Evidence-Based Reading and Writing).5 The ACT reports scores on a scale of 1–36 points for the composite score and for the scores on each of the four tests that make up the ACT.6 The LSAT scale ranges from 120 to 180.7 Most test takers are familiar with the scales for the admission test they are taking and have an idea of what constitutes a good or great score.
Setting a New or Revised Scale
When a new or significantly revised standardized assessment is developed, an immediate psychometric concern is developing the primary score scale(s) that will be used. With the transition to the NextGen bar exam, scale development is top of mind for the NCBE psychometric team, as is avoiding confusion about which scale we are talking about: the one used for the current UBE or the one for the NextGen exam. There are practical and technical considerations, and we will use an established set of principles related to the accuracy of measurement to determine the best score scale. A primary consideration is score precision: whether the number of items or raw score points can support the reported score scale across multiple forms. In a simplified example, a reported score scale with far more score points than the number of raw score points on any single form is untenable; differences on the reported scale may seem large yet represent little difference in demonstrated ability. In the context of the NextGen exam, where the use case is pass/fail decisions, the precision of scores, particularly the distinction between scores in the passing score ranges, is a major factor in scale development.
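A back-of-the-envelope check makes the precision point concrete. Assuming a roughly linear raw-to-scale transformation (a simplification), dividing the width of the reported scale by the number of raw score points shows roughly how far one additional correct answer moves the reported score; all numbers below are hypothetical:

```python
def reported_points_per_raw_point(num_raw_points: int,
                                  scale_min: int, scale_max: int) -> float:
    """Rough size of the reported-score jump produced by one raw-score point,
    assuming a roughly linear raw-to-scale transformation (a simplification).
    """
    return (scale_max - scale_min) / num_raw_points


# Hypothetical: 175 raw points reported on a 40-200 scale -> under one
# reported point per item, a tenable ratio.
print(round(reported_points_per_raw_point(175, 40, 200), 2))   # -> 0.91

# Hypothetical: the same 175 raw points stretched over a 100-800 scale ->
# each item moves the reported score by about 4 points, so adjacent reported
# scores would imply precision the raw scores cannot support.
print(round(reported_points_per_raw_point(175, 100, 800), 2))  # -> 4.0
```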
Notes
1. Mark A. Albanese, PhD, “The Testing Column: Scaling: It’s Not Just for Fish or Mountains,” 83(4) The Bar Examiner 50–56 (December 2014).
2. For the UBE, MEE and MPT scores are scaled to the MBE. The scaling process is covered in earlier Testing Columns, e.g., Susan M. Case, PhD, “The Testing Column: Demystifying Scaling to the MBE: How’d You Do That?” 74(2) The Bar Examiner 45–46 (May 2005); Albanese, supra note 1.
3. The Standards for Educational and Psychological Testing define standardized testing as follows: “When directions, testing conditions, and scoring follow the same detailed procedures for all test takers, the test is said to be standardized. Without such standardization, the accuracy and comparability of score interpretations would be reduced.” American Educational Research Association (AERA), American Psychological Association (APA), and National Council on Measurement in Education (NCME), Standards for Educational and Psychological Testing (AERA, 2014), p. 111, available at https://www.testingstandards.net/open-access-files.html.
4. Mark A. Albanese, PhD, “Raw Scores on the MBE Tell You Little—and Probably Less Than You Think,” 83(1) The Bar Examiner 56–58 (March 2014).
5. College Board, “SAT, Understanding Scores” (2022), available at https://satsuite.collegeboard.org/media/pdf/understanding-sat-scores.pdf.
6. ACT, ACT Test Scores: Understanding Your Scores, available at https://www.act.org/content/act/en/products-and-services/the-act/scores/understanding-your-scores.html.
7. LSAC, LSAT Scoring, available at https://www.lsac.org/lsat/lsat-scoring.
Rosemary Reshetar, EdD, is the Director of Assessment and Research for the National Conference of Bar Examiners.